DRIVERS-2917 - Standardized Performance Testing of ODMs and Integrations #1828
base: master
Conversation
@@ -0,0 +1 @@
{"field1":"miNVpaKW","field2":"CS5VwrwN","field3":"Oq5Csk1w","field4":"ZPm57dhu","field5":"gxUpzIjg","field6":"Smo9whci","field7":"TW34kfzq","field8":55336395,"field9":41992681,"field10":72188733,"field11":46660880,"field12":3527055,"field13":74094448}
format
### Benchmark Server

The MongoDB ODM Performance Benchmark must be run against a standalone MongoDB server running the latest stable database
I think we can open this up to be either a standalone or a replica set of size 1 (this is because some ODMs leverage transactions).
Using a replica set of size 1 makes more sense here, agreed.
### Benchmark placement and scheduling

The MongoDB ODM Performance Benchmark should be placed within the ODM's test directory as an independent test suite. Due
I still think we should leave an option for folks to create their own benchmarking repo if that helps out. I'm open to others' takes on this one, seeing as I worry about maintainers not wanting a benchmark repo.
to the relatively long runtime of the benchmarks, including them as part of an automated suite that runs against every
PR is not recommended. Instead, scheduling benchmark runs on a regular cadence is the recommended method of automating
this suite of tests.
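As one possible illustration of this placement-and-scheduling guidance (a sketch only, not part of the spec: the marker mechanism, the `RUN_ODM_BENCHMARKS` environment variable, and the test name are all assumptions), a Python/pytest suite could live in the ODM's test directory but be excluded from per-PR runs and enabled only by a scheduled job:

```python
# Hypothetical sketch: keep the benchmark suite inside the ODM's test directory
# but skip it unless the scheduled CI job opts in. The RUN_ODM_BENCHMARKS
# variable and the test name are illustrative assumptions, not spec text.
import os

import pytest

# Applied to every test in this module; per-PR CI never sets the variable, so
# the benchmarks only execute on the scheduled cadence.
pytestmark = pytest.mark.skipif(
    os.environ.get("RUN_ODM_BENCHMARKS") != "1",
    reason="ODM benchmarks run on a schedule, not per PR",
)


def test_small_doc_insertion_benchmark():
    """Placeholder for one of the benchmarks defined in this specification."""
    ...
```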
Per your suggestion earlier, we should include some new information about testing mainline use cases.
As discussed earlier in this document, ODM feature sets vary significantly across libraries. Many ODMs have features
unique to them or their niche in the wider ecosystem, which makes specifying concrete benchmark test cases for every
possible API unfeasible. Instead, ODM authors should determine what mainline use cases of their library are not covered
by the benchmarks specified above and expand this testing suite with additional benchmarks to cover those areas.
This section is attempting to specify that ODMs should implement additional benchmark tests to cover mainline use cases that do not fall into those included in this specification. One example would be the use of Django's `in` filter operator: `Model.objects.filter(field__in=["some_val"])`.
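For illustration, here is a minimal sketch of such an ODM-specific benchmark for the Django example above. The model, field, iteration count, and timing harness are all assumptions; a real implementation would reuse whatever benchmark runner the ODM's suite already defines.

```python
# Hypothetical sketch of an ODM-specific benchmark: time Django's `__in` filter
# against a pre-populated collection. SmallDoc, field1, and the iteration count
# are illustrative assumptions, not defined by this specification.
import time

from myapp.models import SmallDoc  # assumed model with a CharField named "field1"


def benchmark_in_filter(iterations: int = 100) -> float:
    """Return the median wall-clock latency of an `__in` filter query."""
    values = ["miNVpaKW", "CS5VwrwN", "Oq5Csk1w"]
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        # list() forces full evaluation of the otherwise-lazy queryset.
        list(SmallDoc.objects.filter(field1__in=values))
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]
```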
### Benchmark Server

The MongoDB ODM Performance Benchmark must be run against a MongoDB replica set of size 1 running the latest stable
database version without authentication or SSL enabled.
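A benchmark harness could make this requirement explicit with a small pre-flight check (a sketch only; the connection URI and the use of PyMongo's `hello` command here are assumptions, not spec requirements):

```python
# Hypothetical pre-flight check: confirm the benchmark target is a single-node
# replica set before running, as required above. The URI is an illustrative
# assumption.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?directConnection=true")
hello = client.admin.command("hello")
assert "setName" in hello, "expected a replica set, not a standalone"
assert len(hello.get("hosts", [])) == 1, "expected a replica set of size 1"
```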
Are we concerned at all about accounting for performance variation due to server performance differences? In the drivers, we keep the server version patch-pinned and upgrade rarely and intentionally via independent commits in order to ensure that our performance testing results are meaningful and are only reflective of the changes in the system under test (the driver, or, in this case, the ODM). If the goal is only to track the performance of ODMs relative to each other and relative to the corresponding drivers, is the intention to have the drivers also implement these tests against the latest server so that we could get that apples-to-apples comparison?
> Are we concerned at all about accounting for performance variation due to server performance differences?
From the Django implementation:
> This is NOT intended to be a comprehensive test suite for every operation, only the most common and widely applicable
@NoahStapp and @Jibola are working on this project for DBX Python (although I am reviewing the implementation PR), so this is just a drive by comment from me, but my impression is that the spec is at least initially focused on getting all the ODMs to agree on what to test.
> In the drivers, we keep the server version patch-pinned and upgrade rarely and intentionally via independent commits in order to ensure that our performance testing results are meaningful and are only reflective of the changes in the system under test (the driver, or, in this case, the ODM). If the goal is only to track the performance of ODMs relative to each other and relative to the corresponding drivers, is the intention to have the drivers also implement these tests against the latest server so that we could get that apples-to-apples comparison?
One more drive by comment: I'd expect each ODM to "perform well" under similar server circumstances (testing the driver is a good call out!) but I'm not sure apples-to-apples is the goal. If other ODMs test their performance using the spec and can demonstrate "good performance" and/or catch performance issues they would otherwise have missed, that would indicate some measure of success to me in the spec design.
I chose the latest stable server version here for the following reason: we've made server performance an explicit company-wide goal. When users experience performance issues on older server versions, one of the first things we recommend is that they upgrade to a newer version. At least in the Python driver, we only run performance tests against 8.0. Using the latest stable version ensures that our performance tests always take advantage of any server improvements and isolate performance issues in the ODM or underlying driver.
Implementing these same tests in the driver for a direct apples-to-apples comparison is a significant amount of work. Several of the tests here use datasets similar to the driver tests' for easier comparison, so using the same server version as the driver tests, to reduce differences, could be useful.
> Using the latest stable version ensures that our performance tests always take advantage of any server improvements and isolate performance issues in the ODM or underlying driver.
I think we should be careful about our goals here: if it is to take advantage of any server improvements and track performance explicitly relative to the most current server performance, then this approach is fine. However, this approach will not isolate performance issues in the ODM or driver because: 1) server performance is not guaranteed to always improve in every release for every feature: the overall trends of the server performance for most features will hopefully keep moving up, but between releases there may be "acceptable" regressions to certain features that are considered a tradeoff to an improvement in another area, and 2) server performance improvements could mask ODM regressions that happen concurrently with the server upgrade. We should be explicit about accepting both of these risks if we are going to move forward with this approach (i.e., note this somewhere in the spec text).
Python Django implementation: mongodb/django-mongodb-backend#366.